Abstract


In  this  paper  we  consider  nonhomogeneous  autoregressive  processes 
which  are  special  cases  of  the  vector-valued  autoregressive  processes 
considered  by  Anderson  (1978)  for  the  analysis  of  panel  survey  data. 

We  point  out  that,  for  a  nonhomogeneous  autoregressive  process  of  order 
higher  than  one,  the  least-squares  estimates  cannot  be  obtained  unless 
repeated  measurements  are  made  on  the  time  series.  We  present  here  two 
Bayesian  approaches  based  on  Kalman  filter  models  which  alleviate  the 
above  difficulty  and  result  in  an  alternative  strategy  for  the  analyses  of 
nonhomogeneous  autoregressive  processes.  In  our  first  approach  the 
notion  of  exchangeability  plays  a  key  role,  whereas  for  our  second 
approach, which results in an adaptive Kalman filter model, an approximation
due to Lindley facilitates the necessary computations for inference.


Key words: Nonhomogeneous autoregressive processes, random coefficient
autoregressive processes, exchangeability, adaptive Kalman
filtering, panel survey data, cross-section studies,
Lindley's approximation
1. Introduction and Overview


To keep the introduction simple, we shall focus attention on a
first-order autoregressive process of the form

$$y_{\alpha t} = \theta_{\alpha t}\, y_{\alpha,t-1} + u_{\alpha t}, \qquad \alpha = 1,\ldots,N;\ t = 1,2,\ldots. \tag{1.1}$$

The autoregressive coefficients $\theta_{\alpha t}$ are assumed unknown and the
innovations $u_{\alpha t}$ are assumed to be independent and normally distributed
with a known mean and variance. When $\theta_{\alpha t} = \theta_t$ for all $\alpha$, (1.1) will be
referred to as a nonhomogeneous (or inhomogeneous) autoregressive process.
When $\theta_{\alpha t} = \theta_\alpha$ for all $t$, (1.1) will be referred to as a random coefficient
autoregressive process. The above nomenclature is in keeping with the
terminology of Anderson (1978) and Liu and Tiao (1980), respectively, who
have written on the above processes.

Nonhomogeneous  and  random  coefficient  autoregressive  processes  have 
a  wide  applicability  in  the  analysis  of  economic,  sociological,  biological 
and industrial data. Such processes can be easily motivated in the context
of "panel surveys," that is, surveys in which several respondents are
interviewed at more than one point in time. Analyses of such data are
sometimes called "cross-section studies" by econometricians [see Hsiao (1986)].
Anderson  (1978)  cites  several  examples  of  panel  surveys  in  the  economic, 
medical  and  sociological  contexts  and  develops  inference  procedures  for  a  set 
of several sequences of observations from the same nonhomogeneous
vector-valued process. The approach taken by Anderson (1978) is least-squares
with an accompanying asymptotic theory. Liu and Tiao (1980) address the
panel survey problem via random coefficient autoregressive processes
which are stationary, that is, with $|\theta_\alpha| < 1$, and propose a Bayesian approach
for inference about the $\theta_\alpha$'s. The Bayesian set-up of Liu and Tiao (1980)
assumes that the $\theta_\alpha$'s are independent drawings from a rescaled beta
distribution.

In  this  paper,  we  present  two  Bayesian  approaches  for  inference  in 
a  nonhomogeneous  autoregressive  process  of  order  p  >  1.  The  process 
considered by us is a special case of the vector-valued nonhomogeneous
autoregressive processes considered by Anderson (1978). A motivation for
the p-th order nonhomogeneous autoregressive process has also been given
by Horigome, Singpurwalla and Soyer (1985), who consider the problem of
"monitoring for reliability growth." The data from reliability growth
problems can be regarded as being the result of a panel survey.

In Section 2 we introduce the vector-valued nonhomogeneous
autoregressive process of Anderson (1978) and review the least squares
estimators  of  the  parameters  of  this  process.  We  point  out  that  for 
such  processes  with  p  >  1,  it  is  not  possible  to  obtain  the  least 
squares  estimators  unless  N  is  also  greater  than  one.  We  contrast  this 
with  the  Bayes  estimators  which  do  not  suffer  from  such  restrictions. 

The  set  up  of  Section  3  can  be  cast  as  an  ordinary  Kalman  filter 
model,  whereas  that  of  Section  4  can  be  cast  as  an  adaptive  Kalman  filter 
model.  The  term  adaptive  filtering  is  used  in  the  engineering  literature 
whenever  some  or  all  of  the  parameters  of  the  observation  or  the  state 
equation  of  the  Kalman  filter  are  estimated  from  the  data  [Broemeling  (1985), 
p.  274].  A  review  of  the  different  approaches  to  adaptive  filtering  is 
given by Mehra (1972). With adaptive Kalman filtering, the automatic
and closed form nature of the ordinary Kalman filter [cf. Meinhold and
Singpurwalla (1983)] is lost. Shumway (1983) has considered maximum
likelihood estimation in adaptive Kalman filtering using the
expectation-maximization algorithm of Dempster et al. (1977). A Bayesian approach
to adaptive Kalman filtering has been considered by Magill (1965) but
Magill's treatment assumes that the unknown parameters of the linear
system can only take a finite number of distinct values. The approach
suggested by us here does not have such a restriction and uses an
approximation due to Lindley (1980) which enables us to obtain computable
results. Our use of Lindley's approximation for the analysis of adaptive
Kalman filter models is new and it represents a contribution, albeit a
minor one, to the state of the art of filtering.

In Section 3 we present our first approach. The notion of
exchangeability plays a key role in our development here: it enables
us to assign a structure of dependence for the coefficients of a
nonhomogeneous autoregressive process of order $p \geq 1$ and $N \geq 1$. Such
a structure of dependence alleviates the requirement that N be greater
than one.

In  Section  4  we  present  our  second  approach.  Here  we  confine 
our attention to the case $p = N = 1$, but assume that the coefficients
of  the  nonhomogeneous  autoregressive  process  are  themselves  described 
by  a  homogeneous  autoregressive  process  of  order  one,  with  an  unknown 
coefficient.  Thus  the  structure  of  dependence  of  Section  4  is  stronger 
than  that  of  Section  3,  but  with  p  =  1,  the  model  of  Section  4  is  simpler 
than  that  of  Section  3. 
2. Least Squares Estimation in Nonhomogeneous Autoregressive
Processes

Suppose that $y_t$ is an m-component column vector and $\Theta_t$ an $m \times m$
time-variant matrix of coefficients. Let $\{u_t\}$ be a sequence of mutually
independent m-component vectors, $u_t$ having a normal distribution with
mean 0 and covariance matrix $\Sigma_t$; the index $t = 1, 2, \ldots$ denotes time. A
first-order vector-valued nonhomogeneous autoregressive process is of the
form

$$y_t = \Theta_t\, y_{t-1} + u_t, \tag{2.1}$$

where $y_0$ is assumed known.

If there are N distinct units (or individuals) in a survey, and
m measurements are taken for each unit, then we will observe N different
time series. Thus, for example, $y_{\alpha t}$ is an m-component vector of
measurements on the $\alpha$-th individual at time t.

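Given $y_{\alpha t}$, $\alpha = 1,\ldots,N$, and $t = 1,\ldots,T$, the least-squares estimator
of $\Theta_t$ obtained by Anderson (1978) is

$$\hat\Theta_t = C_t(1)\, C_{t-1}^{-1}(0), \tag{2.2}$$

where

$$C_t(j) = \frac{1}{N} \sum_{\alpha=1}^{N} y_{\alpha t}\, y'_{\alpha,t-j}, \tag{2.3}$$

and $y'$ denotes the transpose of a column vector $y$.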

Note that the estimators $\hat\Theta_t$ are based on the pooling of
information from all of the N time series.
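As a minimal illustration of the pooling in (2.2) and (2.3), the sketch below treats the scalar case m = 1, where the estimator reduces to a ratio of cross-products summed over the N units. The function name `pooled_ls` and the noise-free toy panel are assumptions made for this example only; they do not come from Anderson (1978).

```python
def pooled_ls(y_prev, y_curr):
    """Scalar (m = 1) version of (2.2): C_t(1) / C_{t-1}(0), pooled over N units."""
    n = len(y_prev)
    c_t_1 = sum(a * b for a, b in zip(y_curr, y_prev)) / n  # C_t(1)
    c_tm1_0 = sum(b * b for b in y_prev) / n                # C_{t-1}(0)
    return c_t_1 / c_tm1_0


# Hypothetical noise-free panel: N = 4 units share the same theta_t at each t.
theta = [0.5, 1.2, 0.8]
panel = [[1.0, -2.0, 3.0, 0.5]]            # y_{alpha,0} for alpha = 1..4
for th in theta:
    panel.append([th * v for v in panel[-1]])

# One pooled estimate per time point t = 1, 2, 3.
estimates = [pooled_ls(panel[t - 1], panel[t]) for t in range(1, len(panel))]
```

With no innovation noise the pooled ratio recovers each coefficient up to rounding; with noise, the averaging over units is what makes a separate estimate at every t feasible.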

If we extend (2.1) to the case of a p-th order nonhomogeneous
autoregressive process, then

$$y_t = \Theta_{1t}\, y_{t-1} + \Theta_{2t}\, y_{t-2} + \cdots + \Theta_{pt}\, y_{t-p} + u_t, \tag{2.4}$$

with $y_0, y_{-1}, \ldots, y_{-(p-1)}$ assumed known.
The least-squares estimators of the unknown elements of the p unknown
$m \times m$ matrices are also obtained by Anderson (1978); these are

$$(\hat\Theta_{1t}, \ldots, \hat\Theta_{pt}) = \bigl(C_t(1), \ldots, C_t(p)\bigr)
\begin{pmatrix} C_t(1,1) & \cdots & C_t(1,p) \\ \vdots & & \vdots \\ C_t(p,1) & \cdots & C_t(p,p) \end{pmatrix}^{-1}, \tag{2.5}$$

where $C_t(j)$ is given by (2.3) and

$$C_t(i,j) = \frac{1}{N} \sum_{\alpha=1}^{N} y_{\alpha,t-i}\, y'_{\alpha,t-j}.$$


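For the case $m = N = 1$, that is, when we have only one measurement per
item at time t, say $y_t$, and only one item to observe, $(\Theta_{1t}, \ldots, \Theta_{pt})$
simplifies to $\theta_t'$, where $\theta_t$ is a column vector with elements $(\theta_{1t}, \ldots, \theta_{pt})$,
and the equation for the least squares estimator of $\theta_t$ is

$$\hat\theta_t' \left( y_{t-1}^{(p)}\, y_{t-1}^{(p)\prime} \right) = y_t\, y_{t-1}^{(p)\prime}, \tag{2.6}$$

where $y_{t-1}^{(p)} = (y_{t-1}, \ldots, y_{t-p})'$.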


Note that $\left(y_{t-1}^{(p)}\, y_{t-1}^{(p)\prime}\right)$ is the outer product matrix, and is of rank 1.
Thus when $m = N = 1$ and $p > 1$, the least-squares estimators (which under this
set-up are also the maximum likelihood estimators) of the coefficients
of the p-th order nonhomogeneous autoregressive processes are not uniquely
defined.
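The rank deficiency can be checked directly for p = 2. The sketch below uses hypothetical lag values and two hand-picked coefficient vectors; the helper names `outer`, `det2` and `normal_eq_gap` are introduced only for this illustration. It shows that the outer product matrix is singular and that two different candidates both satisfy the normal equations, written in column form as $(y^{(p)}_{t-1} y^{(p)\prime}_{t-1})\,\theta = y_t\, y^{(p)}_{t-1}$.

```python
def outer(x):
    """Outer product x x' of a vector with itself."""
    return [[a * b for b in x] for a in x]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def normal_eq_gap(theta, x, y_t):
    """Largest absolute violation of (x x') theta = y_t x."""
    a = outer(x)
    lhs = [sum(a[i][j] * theta[j] for j in range(len(x))) for i in range(len(x))]
    return max(abs(l - y_t * xi) for l, xi in zip(lhs, x))


x = [2.0, -1.0]   # hypothetical lags (y_{t-1}, y_{t-2})
y_t = 3.0         # hypothetical current observation

singular = det2(outer(x)) == 0.0                  # rank 1, so no inverse exists
gap_a = normal_eq_gap([1.5, 0.0], x, y_t)         # first candidate solution
gap_b = normal_eq_gap([0.0, -3.0], x, y_t)        # a different candidate
```

Any coefficient vector whose inner product with the lag vector equals $y_t$ solves the same equations, which is the non-uniqueness described above.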
For p = 1 the least squares estimators do of course exist and
these take the following simple and intuitive form

$$\hat\theta_t = y_t / y_{t-1}. \tag{2.7}$$

In Section 3 we shall obtain Bayes estimators of $(\theta_{1t}, \ldots, \theta_{pt})$
for  the  case  m=N=l,  and  show  that  these  can  always  be  obtained  and  are 
unique.  It  is  important  to  note  that  in  obtaining  Bayes  estimators  we 
are  incorporating  some  additional  structure  to  the  model,  the  nature  of 
which will be clarified in the sequel. The additional structure
compensates for the lack of information due to the limitation imposed by N
being  equal  to  one. 

3. Bayesian Estimation in Nonhomogeneous Autoregressive Processes
Assuming Exchangeability of Coefficients.

In this section we first consider the p-th order nonhomogeneous
autoregressive process (2.4) with $m = N = 1$ and discuss inference for $\theta_t$.
Later on we extend our results to processes with $N > 1$. In some
applications it may be reasonable to assume a time pattern for the $\theta_t$'s; see
for example, Section 4. However, in most instances this may not be true
and what may be reasonable is some form of dependence among the vectors
$\theta_1, \theta_2, \ldots$. A simple way of describing such dependence is to assume
that the sequence of column vectors $\theta_t$ is exchangeable; that is,
the joint distributions of the $\theta_t$'s are invariant under permutations. Exchangeability describes
a mild form of dependence and this is most easily obtained by assuming
that the $\theta_t$'s are generated by some multivariate distribution G, indexed
by a vector of hyper-parameters $\lambda$, on which a prior distribution $\pi$
is assigned.


It may be of interest to note that if G is not specified but
estimated from the data, then the above set up would be referred to as
empirical Bayes, whereas if G were specified but the uncertainty about
$\lambda$ not described by $\pi$ but instead $\lambda$ estimated from the data, then the
above set up would be referred to as parametric empirical Bayes
[cf. Morris (1983)]. With both G and $\pi$ completely specified, as we
propose to do here, the above set up would be referred to as Bayes
empirical Bayes [Deely and Lindley (1981)].

In this paper, we shall assume that the $\theta_t$'s are generated by
a multivariate normal distribution with an unknown mean vector
$\lambda = (\lambda_1, \ldots, \lambda_p)'$ and a known $p \times p$ covariance matrix V. Our uncertainty
about $\lambda$ will also be described by a multivariate normal distribution
with a mean vector $m_0$ and covariance matrix $s_0$. Both $m_0$ and $s_0$ have
to be specified initially; however upon the receipt of data they will
be updated according to Bayes law. Thus to summarize, a proper Bayesian
description of the nonhomogeneous autoregressive process considered by
us goes as follows:


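$$\begin{aligned}
y_t &= \theta_t'\, y_{t-1}^{(p)} + u_t, \ \text{with } u_t \sim N(0, \sigma_u^2), \ \text{where } \sigma_u^2 \text{ is specified;} \\
\theta_t &\sim N(\lambda, V), \ \text{where } V \text{ is specified, and} \\
\lambda &\sim N(m_0, s_0), \ \text{where } m_0 \text{ and } s_0 \text{ are also specified.}
\end{aligned} \tag{3.1}$$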


The above set-up can also be expressed as a dynamic linear model
in the sense of Harrison and Stevens (1976) and therefore the Kalman
filter solution can be used for inference about $\theta_t$ given $y_1, \ldots, y_t$.
To see this, we first rewrite (3.1) as

$$\begin{aligned}
y_t &= \theta_t'\, y_{t-1}^{(p)} + u_t, \ \text{with } u_t \sim N(0, \sigma_u^2), \ \text{and} \\
\theta_t &= \lambda + w_t, \ \text{with } w_t \sim N(0, V),
\end{aligned} \tag{3.2}$$

where the $u_t$'s are independent of the $w_t$'s, $\lambda$ is independent of $w_t$ and
$\lambda \sim N(m_0, s_0)$.

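To cast (3.1) into the format of a Kalman filter model, we let

$$\beta_t = \begin{pmatrix} \theta_t \\ \lambda \end{pmatrix}, \quad
F_t = \left( y_{t-1}^{(p)\prime},\ 0' \right), \quad
G = \begin{pmatrix} 0 & I_p \\ 0 & I_p \end{pmatrix}, \quad
v_t = \begin{pmatrix} w_t \\ 0 \end{pmatrix}, \quad
W = \begin{pmatrix} V & 0 \\ 0 & 0 \end{pmatrix},$$

where $I_p$ is the $p \times p$ identity matrix and $W$ is the covariance matrix of
$v_t$, and we rewrite (3.2) as

$$\begin{aligned}
y_t &= F_t\, \beta_t + u_t, \\
\beta_t &= G\, \beta_{t-1} + v_t.
\end{aligned} \tag{3.3}$$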


Using  the  well  known  solution  to  the  standard  Kalman  Filter  model 
[see  for  example  Meinhold  and  Singpurwalla  (1983)],  we  have  the  result 
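that given $y(t) = (y_1, \ldots, y_t)$ and $y_0, \ldots, y_{-(p-1)}$,

$$(\theta_t \mid y(t)) \sim N(\hat\theta_t, \Sigma_t),$$

where

$$\hat\theta_t = m_{t-1} + \frac{(s_{t-1} + V)\, y_{t-1}^{(p)}}{\sigma_u^2 + y_{t-1}^{(p)\prime} (s_{t-1} + V)\, y_{t-1}^{(p)}} \left( y_t - m_{t-1}'\, y_{t-1}^{(p)} \right) \tag{3.4}$$

and

$$\Sigma_t = (s_{t-1} + V) - \frac{(s_{t-1} + V)\, y_{t-1}^{(p)}\, y_{t-1}^{(p)\prime}\, (s_{t-1} + V)}{\sigma_u^2 + y_{t-1}^{(p)\prime} (s_{t-1} + V)\, y_{t-1}^{(p)}}, \tag{3.5}$$

with

$$m_t = m_{t-1} + \frac{s_{t-1}\, y_{t-1}^{(p)}}{\sigma_u^2 + y_{t-1}^{(p)\prime} (s_{t-1} + V)\, y_{t-1}^{(p)}} \left( y_t - m_{t-1}'\, y_{t-1}^{(p)} \right) \tag{3.6}$$

and

$$s_t = s_{t-1} - \frac{s_{t-1}\, y_{t-1}^{(p)}\, y_{t-1}^{(p)\prime}\, s_{t-1}}{\sigma_u^2 + y_{t-1}^{(p)\prime} (s_{t-1} + V)\, y_{t-1}^{(p)}}. \tag{3.7}$$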

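Furthermore, the posterior distribution of $\lambda$ given $y(t)$ is

$$(\lambda \mid y(t)) \sim N(m_t, s_t),$$

where the updating formulas for $m_t$ and $s_t$ are given above. The covariance
of $\theta_t$ and $\lambda$ given $y(t)$ is given by

$$\operatorname{cov}(\theta_t, \lambda \mid y(t)) = s_{t-1} - \frac{(s_{t-1} + V)\, y_{t-1}^{(p)}\, y_{t-1}^{(p)\prime}\, s_{t-1}}{\sigma_u^2 + y_{t-1}^{(p)\prime} (s_{t-1} + V)\, y_{t-1}^{(p)}}. \tag{3.8}$$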


The predictive density of $y_{t+1}$ given $y(t)$ is of the form

$$(y_{t+1} \mid y(t)) \sim N\!\left( m_t'\, y_t^{(p)},\ \ y_t^{(p)\prime} (s_t + V)\, y_t^{(p)} + \sigma_u^2 \right).$$
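The recursions (3.4) to (3.7) involve only matrix-vector products and one scalar division, so a single update step is short to write down. The following is a minimal sketch in plain Python, assuming symmetric $s_{t-1}$ and $V$ stored as nested lists; the function name `exchangeable_update`, its argument names and the numerical inputs are illustrative assumptions, not part of the original development.

```python
def matvec(a, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def exchangeable_update(m_prev, s_prev, v, sigma_u2, x, y_t):
    """One step of (3.4)-(3.7); x is the lag vector (y_{t-1}, ..., y_{t-p})."""
    p = len(x)
    r = [[s_prev[i][j] + v[i][j] for j in range(p)] for i in range(p)]  # s_{t-1} + V
    rx = matvec(r, x)             # (s_{t-1} + V) x
    sx = matvec(s_prev, x)        # s_{t-1} x
    q = sigma_u2 + dot(x, rx)     # variance of y_t given y(t-1)
    e = y_t - dot(m_prev, x)      # one-step forecast error
    theta_hat = [m_prev[i] + rx[i] * e / q for i in range(p)]                         # (3.4)
    big_sigma = [[r[i][j] - rx[i] * rx[j] / q for j in range(p)] for i in range(p)]   # (3.5)
    m_new = [m_prev[i] + sx[i] * e / q for i in range(p)]                             # (3.6)
    s_new = [[s_prev[i][j] - sx[i] * sx[j] / q for j in range(p)] for i in range(p)]  # (3.7)
    return theta_hat, big_sigma, m_new, s_new


# Hypothetical inputs for p = 2 with a single series (N = 1).
m0 = [0.0, 0.0]
s0 = [[1.0, 0.0], [0.0, 1.0]]
v = [[0.5, 0.0], [0.0, 0.5]]
theta_hat, big_sigma, m1, s1 = exchangeable_update(m0, s0, v, 1.0, [1.0, 2.0], 3.0)
```

Because the denominator is a scalar when N = 1, no matrix inversion is needed, which is why the update is available for every p even with a single series.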
Under the assumption of a quadratic loss, $\hat\theta_t$ and $m_t$ are the Bayes estimators
of $\theta_t$ and $\lambda$. When p=1, the Bayes estimator of $\theta_t$ simplifies to

$$\hat\theta_t = \pi_t\, m_{t-1} + (1 - \pi_t)\, \frac{y_t}{y_{t-1}}, \tag{3.9}$$

where

$$\pi_t = \frac{\sigma_u^2}{\sigma_u^2 + (\sigma_v^2 + s_{t-1})\, y_{t-1}^2},$$

and $\sigma_v^2$ is the variance of $w_t$ in (3.2).


Thus for a first order nonhomogeneous autoregressive process the
Bayes estimator at time t is a weighted average of the prior mean of
$\theta_t$ (namely $m_{t-1}$) and the least squares estimate $y_t / y_{t-1}$. The weight
$\pi_t$ is a function of the variance components $\sigma_u^2$, $\sigma_v^2$ and $s_{t-1}$. If $\sigma_u^2$
gets small or $(\sigma_v^2 + s_{t-1})$ gets large, then $\pi_t$ gets small and in (3.9)
more weight is given to the least squares estimator. We also note that
the Bayes estimator at time t is based on all the available data at time
t, whereas the least squares estimator is based on $y_t$ and $y_{t-1}$ only.
As a final comment, we note that the Bayes estimator $\hat\theta_t$ can be obtained
for any order p of the process, irrespective of the value of N.
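The behaviour of the weight in (3.9) is easy to see numerically. The sketch below is a plain Python illustration with hypothetical inputs; the function names `shrinkage_weight` and `bayes_estimate_p1` are assumptions made for this example.

```python
def shrinkage_weight(sigma_u2, sigma_v2, s_prev, y_prev):
    """Weight pi_t on the prior mean m_{t-1} in (3.9)."""
    return sigma_u2 / (sigma_u2 + (sigma_v2 + s_prev) * y_prev ** 2)


def bayes_estimate_p1(m_prev, s_prev, sigma_u2, sigma_v2, y_prev, y_t):
    """Bayes estimator of theta_t for p = 1: a weighted average of the
    prior mean m_{t-1} and the least squares estimate y_t / y_{t-1}."""
    pi_t = shrinkage_weight(sigma_u2, sigma_v2, s_prev, y_prev)
    return pi_t * m_prev + (1.0 - pi_t) * (y_t / y_prev)


# Hypothetical values: prior mean 0.5, least squares estimate 3 / 2 = 1.5.
moderate = bayes_estimate_p1(0.5, 0.2, 1.0, 0.3, 2.0, 3.0)     # pi_t = 1/3
nearly_ls = bayes_estimate_p1(0.5, 0.2, 1e-9, 0.3, 2.0, 3.0)   # sigma_u^2 tiny
```

In the first call the weight is 1/3, so the estimate sits between 0.5 and 1.5; in the second the innovation variance is negligible and the estimate essentially coincides with the least squares value.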

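For the p-th order nonhomogeneous process with $m = 1$ and $N > 1$,
we assume, following Anderson (1978), that the coefficients $\theta_t$ are identical
for all cross-sectional units and write the model as

$$\mathbf{y}_t = Y_{t-1}^{(N \times p)}\, \theta_t + \mathbf{u}_t, \tag{3.10}$$

where $Y_{t-1}^{(N \times p)} = (\mathbf{y}_{t-1}, \mathbf{y}_{t-2}, \ldots, \mathbf{y}_{t-p})$ is an $N \times p$ matrix and
$\mathbf{y}_t = (y_{1t}, y_{2t}, \ldots, y_{Nt})'$.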
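The N-dimensional vector $\mathbf{u}_t = (u_{1t}, u_{2t}, \ldots, u_{Nt})'$ is assumed to be
normally distributed with mean vector $\mathbf{0}$ and a specified
variance-covariance matrix, say U.

By judging $\{\theta_t\}$ as an exchangeable sequence, by replacing $y_t$ by
the N-dimensional vector $\mathbf{y}_t$, $y_{t-1}^{(p)\prime}$ by the $(N \times p)$ matrix $Y_{t-1}^{(N \times p)}$ and $u_t$
by $\mathbf{u}_t$ in (3.2), we can cast the above model into the framework of the
Kalman filter. We then appeal to the Kalman filter solution, and obtain
the posterior distribution of $\theta_t$ given $y(t) = (\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_t)$ as a normal
with mean

$$\hat\theta_t = m_{t-1} + (s_{t-1} + V)\, Y_{t-1}^{(N \times p)\prime}\, P_t^{-1} \left( \mathbf{y}_t - Y_{t-1}^{(N \times p)}\, m_{t-1} \right) \tag{3.11}$$

and variance

$$\Sigma_t = (s_{t-1} + V) - (s_{t-1} + V)\, Y_{t-1}^{(N \times p)\prime}\, P_t^{-1}\, Y_{t-1}^{(N \times p)}\, (s_{t-1} + V), \tag{3.12}$$

where

$$P_t = Y_{t-1}^{(N \times p)}\, (s_{t-1} + V)\, Y_{t-1}^{(N \times p)\prime} + U,$$

$$m_t = m_{t-1} + s_{t-1}\, Y_{t-1}^{(N \times p)\prime}\, P_t^{-1} \left( \mathbf{y}_t - Y_{t-1}^{(N \times p)}\, m_{t-1} \right), \ \text{and}$$

$$s_t = s_{t-1} - s_{t-1}\, Y_{t-1}^{(N \times p)\prime}\, P_t^{-1}\, Y_{t-1}^{(N \times p)}\, s_{t-1}.$$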

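We note that $m_t$ and $s_t$ are the posterior mean and covariance matrix of $\lambda$.

For p=1 and $U = \sigma_u^2 I_N$, $\hat\theta_t$ simplifies to

$$\hat\theta_t = \pi_t\, m_{t-1} + (1 - \pi_t)\, \frac{\mathbf{y}_{t-1}'\, \mathbf{y}_t}{\mathbf{y}_{t-1}'\, \mathbf{y}_{t-1}},$$

where

$$\pi_t = \frac{\sigma_u^2}{\sigma_u^2 + \mathbf{y}_{t-1}'\, \mathbf{y}_{t-1}\, (s_{t-1} + \sigma_v^2)},$$

and $\mathbf{y}_{t-1}'\, \mathbf{y}_t / \mathbf{y}_{t-1}'\, \mathbf{y}_{t-1}$ is the least-squares estimate of $\theta_t$.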

Thus  the  Kalman  filter  solution  results  in  a  shrinkage  estimator 
which  is  a  linear  combination  of  the  prior  mean  and  the  least  squares 
estimator. 
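For p = 1 and $U = \sigma_u^2 I_N$ the shrinkage form can be checked against a second route to the same posterior mean. Because the N innovations are then independent, the units can be absorbed one at a time with scalar updates, and the result should agree with the closed form. The sketch below does both in plain Python; the function names and the toy numbers are assumptions made for this illustration.

```python
def panel_shrinkage(m_prev, s_prev, sigma_u2, sigma_v2, y_prev, y_curr):
    """Closed-form estimate for p = 1, U = sigma_u^2 I_N: a weighted average of
    the prior mean and the pooled least squares estimate."""
    sxx = sum(a * a for a in y_prev)                    # y_{t-1}' y_{t-1}
    sxy = sum(a * b for a, b in zip(y_prev, y_curr))    # y_{t-1}' y_t
    pi_t = sigma_u2 / (sigma_u2 + sxx * (s_prev + sigma_v2))
    return pi_t * m_prev + (1.0 - pi_t) * sxy / sxx


def sequential_mean(m_prev, s_prev, sigma_u2, sigma_v2, y_prev, y_curr):
    """Same posterior mean, absorbing the N units one at a time
    (valid here because U is diagonal)."""
    mean, var = m_prev, s_prev + sigma_v2   # prior for theta_t given y(t-1)
    for x, y in zip(y_prev, y_curr):
        q = sigma_u2 + var * x * x
        mean = mean + var * x * (y - mean * x) / q
        var = var * sigma_u2 / q
    return mean


# Hypothetical panel with N = 3 units.
y_prev = [1.0, 2.0, -1.0]
y_curr = [0.9, 1.7, -1.1]
closed = panel_shrinkage(0.5, 0.2, 1.0, 0.3, y_prev, y_curr)
stepwise = sequential_mean(0.5, 0.2, 1.0, 0.3, y_prev, y_curr)
```

Here the pooled least squares estimate is 0.9, the weight on the prior mean is 0.25, and both routes give 0.8.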

4.  Bayesian  Estimation  in  Nonhomogeneous  Autoregressive  Processes 
Assuming  Autoregression  of  Coefficients. 

Consider the first order nonhomogeneous autoregressive process (2.1)
with $m = N = 1$ and assume a time pattern to the $\theta_t$'s, where now $\Theta_t = \theta_{1t} = \theta_t$.
Specifically, let $\theta_t = \alpha\, \theta_{t-1} + w_t$, where $\alpha$ is unknown and the innovation
$w_t$ is normal with mean 0 and known variance $\sigma_w^2$. Thus the model considered
here is of the form

$$y_t = \theta_t\, y_{t-1} + u_t \quad \text{and} \quad \theta_t = \alpha\, \theta_{t-1} + w_t, \tag{4.1}$$

where $u_t \sim N(0, \sigma_u^2)$, $w_t \sim N(0, \sigma_w^2)$ with $\sigma_u^2$ and $\sigma_w^2$ known and $y_0$ is known.
The sequences $\{u_t\}$ and $\{w_t\}$ are assumed independent. Uncertainty about
$\theta_0$ is described by a normal density with mean $\hat\theta_0$ and variance $\hat\Sigma_0$ which
are both specified.
The above set up is that of a Kalman filter model except for the
fact that $\alpha$ is unknown. Suppose that our uncertainty about $\alpha$ is
described by $p(\alpha)$, a prior distribution for $\alpha$ given some background
information. Then given some data $y(t)$, where we recall that
$y(t) = (y_1, \ldots, y_t)$, our goal is to make inferences about $\theta_t$ and the
future observation $y_{t+1}$. Extending consideration to $\alpha$,
the posterior distribution of $\theta_t$ is

$$p(\theta_t \mid y(t)) = \int p(\theta_t \mid y(t), \alpha)\, p(\alpha \mid y(t))\, d\alpha, \tag{4.2}$$

where $p(\theta_t \mid y(t), \alpha)$ is obtained by the usual Kalman filter solution
with $\alpha$ assumed known and $p(\alpha \mid y(t))$, the posterior distribution of $\alpha$
given $y(t)$, is obtained via Bayes law as

$$p(\alpha \mid y(t)) \propto L(\alpha;\, y(t))\, p(\alpha), \tag{4.3}$$

with $L(\alpha;\, y(t))$ as the likelihood function of $\alpha$. For the ordinary
Kalman filter with $\alpha$ known, the predictive distribution of $y_i$ given
$y(i-1)$ is


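$$p(y_i \mid y(i-1), \alpha) = \int_{\theta_i} p(y_i \mid \theta_i, y(i-1), \alpha)\, p(\theta_i \mid y(i-1), \alpha)\, d\theta_i, \tag{4.4}$$

where $(\theta_i \mid y(i-1), \alpha) \sim N(\alpha\, \hat\theta_{i-1}, R_i)$ with $R_i = \alpha^2\, \hat\Sigma_{i-1} + \sigma_w^2$; the
quantities $\hat\theta_{i-1}$ and $\hat\Sigma_{i-1}$ are obtained recursively via the relationship
$(\theta_{i-1} \mid y(i-1), \alpha) \sim N(\hat\theta_{i-1}, \hat\Sigma_{i-1})$. Specifically,

$$\hat\theta_{i-1} = \frac{\alpha\, \hat\theta_{i-2}\, \sigma_u^2 + R_{i-1}\, y_{i-1}\, y_{i-2}}{y_{i-2}^2\, R_{i-1} + \sigma_u^2} \tag{4.5}$$

and

$$\hat\Sigma_{i-1} = \frac{R_{i-1}\, \sigma_u^2}{y_{i-2}^2\, R_{i-1} + \sigma_u^2}. \tag{4.6}$$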


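It now follows from the above that

$$(y_i \mid y(i-1), \alpha) \sim N\!\left( \alpha\, \hat\theta_{i-1}\, y_{i-1},\ \ y_{i-1}^2\, R_i + \sigma_u^2 \right), \tag{4.7}$$

and so the likelihood of $\alpha$ may be written

$$L(\alpha;\, y(t)) = \prod_{i=1}^{t} p(y_i \mid y(i-1), \alpha),$$

where the terms in the product are determined by (4.7).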
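The likelihood of $\alpha$ can be accumulated in a single pass over the data, using (4.7) for each term and (4.5) and (4.6), with the index shifted by one, to carry the filter forward. The following is a minimal sketch in plain Python; the function name `log_likelihood`, its argument names and the toy data are assumptions made for this illustration.

```python
import math


def log_likelihood(alpha, y, y0, theta0, sigma0, sigma_u2, sigma_w2):
    """log L(alpha; y(t)) built from (4.7); y = [y_1, ..., y_t], y0 is known,
    theta0 and sigma0 are the specified prior mean and variance of theta_0."""
    theta_hat, sig, y_prev, ll = theta0, sigma0, y0, 0.0
    for y_i in y:
        r = alpha ** 2 * sig + sigma_w2          # R_i
        mean = alpha * theta_hat * y_prev        # predictive mean in (4.7)
        var = y_prev ** 2 * r + sigma_u2         # predictive variance in (4.7)
        ll += -0.5 * (math.log(2.0 * math.pi * var) + (y_i - mean) ** 2 / var)
        # Filtering step: (4.5) and (4.6) with i-1 replaced by i.
        theta_hat = (alpha * theta_hat * sigma_u2 + r * y_i * y_prev) / var
        sig = r * sigma_u2 / var
        y_prev = y_i
    return ll


# Hypothetical data: a series that stays at 1, as if theta_t were constant at 1.
y_obs = [1.0, 1.0, 1.0, 1.0]
ll_one = log_likelihood(1.0, y_obs, 1.0, 1.0, 0.01, 0.1, 0.01)
ll_zero = log_likelihood(0.0, y_obs, 1.0, 1.0, 0.01, 0.1, 0.01)
```

For these inputs the value at $\alpha = 1$ exceeds the value at $\alpha = 0$, as one would expect for a series whose coefficient does not decay.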
We may now write (4.2) as

$$p(\theta_t \mid y(t)) = \frac{\int p(\theta_t \mid y(t), \alpha)\, L(\alpha;\, y(t))\, p(\alpha)\, d\alpha}{\int L(\alpha;\, y(t))\, p(\alpha)\, d\alpha}. \tag{4.8}$$


Any reasonable prior distribution of $\alpha$ that we may consider leads
us to integrals in (4.8) which cannot be expressed in closed form. The
same is also true when we consider the predictive distribution of $y_{t+1}$
given $y(t)$; that is, the ratio of the integrals

$$p(y_{t+1} \mid y(t)) = \frac{\int p(y_{t+1} \mid y(t), \alpha)\, L(\alpha;\, y(t))\, p(\alpha)\, d\alpha}{\int L(\alpha;\, y(t))\, p(\alpha)\, d\alpha}. \tag{4.9}$$

One way to deal with the evaluation of the integrals (4.8) and (4.9)
is via numerical methods. Another way is via an approximation due to
Lindley (1980) which performs well as $t \to \infty$. For convenience, we give
below an overview of Lindley's approximation.
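To make the numerical route concrete, the sketch below evaluates the posterior mean $E(\theta_t \mid y(t))$ as the ratio in (4.8), with $p(\theta_t \mid y(t), \alpha)$ replaced by its mean, using the trapezoid rule on a grid of $\alpha$ values. It assumes a uniform prior on $[a, b]$, so the prior cancels from the ratio. The function names, the grid size and the toy data are assumptions made for this illustration, and the grid rule is only one of many possible numerical methods.

```python
import math


def filter_given_alpha(alpha, y, y0, theta0, sigma0, sigma_u2, sigma_w2):
    """Return (E(theta_t | y(t), alpha), log L(alpha; y(t))) via (4.5)-(4.7)."""
    theta_hat, sig, y_prev, ll = theta0, sigma0, y0, 0.0
    for y_i in y:
        r = alpha ** 2 * sig + sigma_w2
        var = y_prev ** 2 * r + sigma_u2
        err = y_i - alpha * theta_hat * y_prev
        ll -= 0.5 * (math.log(2.0 * math.pi * var) + err ** 2 / var)
        theta_hat = (alpha * theta_hat * sigma_u2 + r * y_i * y_prev) / var
        sig = r * sigma_u2 / var
        y_prev = y_i
    return theta_hat, ll


def posterior_mean_theta(y, y0, theta0, sigma0, sigma_u2, sigma_w2, a, b, n=201):
    """Trapezoid-rule evaluation of the ratio of integrals over alpha in [a, b]."""
    grid = [a + (b - a) * k / (n - 1) for k in range(n)]
    out = [filter_given_alpha(al, y, y0, theta0, sigma0, sigma_u2, sigma_w2)
           for al in grid]
    ll_max = max(ll for _, ll in out)
    w = [math.exp(ll - ll_max) for _, ll in out]   # likelihood, rescaled for stability
    w[0] *= 0.5                                    # trapezoid end-point weights
    w[-1] *= 0.5
    cond_means = [th for th, _ in out]
    est = sum(wi * th for wi, th in zip(w, cond_means)) / sum(w)
    return est, cond_means


# Hypothetical series and settings.
y_obs = [0.9, 0.8, 0.75, 0.6]
est, cond_means = posterior_mean_theta(y_obs, 1.0, 1.0, 0.5, 0.1, 0.05, 0.0, 1.0)
```

The estimate is a likelihood-weighted average of the conditional filter means, so it always lies between the smallest and largest of them.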
Lindley (1980) develops asymptotic expansions for the ratio of
integrals that occur frequently in Bayesian analyses. He considers ratios
of the form

$$\int w(\alpha)\, e^{L(\alpha)}\, d\alpha \Big/ \int p(\alpha)\, e^{L(\alpha)}\, d\alpha, \tag{4.10}$$

where $\alpha$ is an (unknown) parameter and $L(\alpha)$ is the logarithm of its
likelihood, with dependence on $y(t)$, the data, being suppressed; that is,

$$L(\alpha) = \log L(\alpha;\, y(t)) = \sum_{i=1}^{t} \log \{ p(y_i \mid y(i-1), \alpha) \}.$$

The quantity $w(\alpha) \stackrel{\text{def}}{=} p(\alpha)\, u(\alpha)$, and $u(\alpha)$ is some function of $\alpha$ that is
of interest. For example, if $u(\alpha) = \alpha$, then (4.10) is the mean of the
posterior distribution of $\alpha$.

Lindley's approximation is concerned with the asymptotic behavior
of (4.10) as $t \to \infty$. This is facilitated by the fact that asymptotically,
$L(\alpha)$ concentrates around $\hat\alpha$, its maximum, assuming that the maximum is
unique. The idea is to obtain a Taylor's series expansion of all the
functions of $\alpha$ in (4.10) about $\hat\alpha$. Let $H(\alpha) = \log p(\alpha)$; then (4.10) can also
be written as

$$\int u(\alpha)\, e^{L(\alpha) + H(\alpha)}\, d\alpha \Big/ \int e^{L(\alpha) + H(\alpha)}\, d\alpha. \tag{4.11}$$

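If we let $u(\alpha) = p(\theta_t \mid y(t), \alpha)$ [or $p(y_{t+1} \mid y(t), \alpha)$], then an
approximation to (4.11) is

$$u(\hat\alpha) - \frac{u_2 + 2 u_1 H_1}{2 L_2} + \frac{u_1 L_3}{2 L_2^2}, \tag{4.12}$$

where

$$u_i = \left. \frac{d^i u(\alpha)}{d\alpha^i} \right|_{\alpha = \hat\alpha},\ i = 1, 2; \qquad
H_1 = \left. \frac{dH(\alpha)}{d\alpha} \right|_{\alpha = \hat\alpha}; \qquad \text{and} \qquad
L_i = \left. \frac{d^i L(\alpha)}{d\alpha^i} \right|_{\alpha = \hat\alpha},\ i = 2, 3.$$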


A convenient prior for $\alpha$ is the uniform on $[a, b]$. In this case
$H(\alpha)$ is a constant, so that $H_1 = 0$. When $u(\alpha) = E(\theta_t \mid y(t), \alpha)$, (4.12) gives us
approximately $E(\theta_t \mid y(t))$, the optimal adaptive Kalman filter estimate.
When $u(\alpha) = p(y_{t+1} \mid y(t), \alpha)$, (4.12) gives us approximately $p(y_{t+1} \mid y(t))$.
The quantities $E(\theta_t \mid y(t), \alpha)$ and $p(y_{t+1} \mid y(t), \alpha)$ are given by (4.5) and
(4.7) respectively. To obtain $E(y_{t+1} \mid y(t))$, the predictive mean, we
set $u(\alpha) = E(y_{t+1} \mid y(t), \alpha)$.
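The approximation (4.12) needs only the value of $u$ and a few derivatives at $\hat\alpha$, which can be obtained by finite differences when analytic derivatives are inconvenient. The sketch below is a generic plain Python illustration, not tied to the Kalman filter likelihood: the caller supplies $u$, the log-likelihood $L$, the log-prior $H$ and the maximizer $\hat\alpha$. The function names, the step size `h` and the Gaussian-shaped toy log-likelihood are assumptions chosen so that the exact answer is known.

```python
def derivs(f, x, h):
    """Central finite-difference estimates of f'(x), f''(x) and f'''(x)."""
    f0 = f(x)
    fp1, fm1 = f(x + h), f(x - h)
    fp2, fm2 = f(x + 2 * h), f(x - 2 * h)
    d1 = (fp1 - fm1) / (2 * h)
    d2 = (fp1 - 2 * f0 + fm1) / h ** 2
    d3 = (fp2 - 2 * fp1 + 2 * fm1 - fm2) / (2 * h ** 3)
    return d1, d2, d3


def lindley(u, log_lik, log_prior, alpha_hat, h=1e-3):
    """Approximation (4.12) to the ratio (4.11), evaluated at the maximizer
    alpha_hat of the log-likelihood."""
    u1, u2, _ = derivs(u, alpha_hat, h)
    h1, _, _ = derivs(log_prior, alpha_hat, h)
    _, l2, l3 = derivs(log_lik, alpha_hat, h)
    return u(alpha_hat) - (u2 + 2 * u1 * h1) / (2 * l2) + u1 * l3 / (2 * l2 ** 2)


# Toy check: quadratic log-likelihood with maximum at 0.7 and curvature -1/0.04,
# flat prior (H constant), and u(alpha) = alpha^2.
approx = lindley(
    u=lambda a: a * a,
    log_lik=lambda a: -0.5 * (a - 0.7) ** 2 / 0.04,
    log_prior=lambda a: 0.0,
    alpha_hat=0.7,
)
```

For this toy case the posterior of $\alpha$ is normal with mean 0.7 and variance 0.04 (ignoring any truncation of the flat prior), so the exact posterior mean of $\alpha^2$ is $0.49 + 0.04 = 0.53$, which the approximation reproduces because $L_3 = 0$.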